AITopics | non-latin character

non-latin character

Exploiting Cultural Biases via Homoglyphs in Text-to-Image Synthesis

Struppek, Lukas (Technical University of Darmstadt) | Hintersdorf, Dominik (Technical University of Darmstadt) | Friedrich, Felix (Technical University of Darmstadt) | Brack, Manuel (Technical University of Darmstadt) | Schramowski, Patrick (Technical University of Darmstadt) | Kersting, Kristian (Technical University of Darmstadt)

Journal of Artificial Intelligence Research | Dec-18-2023

Models for text-to-image synthesis, such as DALL-E 2 and Stable Diffusion, have recently drawn a lot of interest from academia and the general public. These models are capable of producing high-quality images that depict a variety of concepts and styles when conditioned on textual descriptions. However, these models adopt cultural characteristics associated with specific Unicode scripts from their vast amount of training data, which may not be immediately apparent. We show that by simply inserting single non-Latin characters in the textual description, common models reflect cultural biases in their generated images. We analyze this behavior both qualitatively and quantitatively and identify a model's text encoder as the root cause of the phenomenon. Such behavior can be interpreted as a model feature, offering users a simple way to customize the image generation and reflect their own cultural background. Yet, malicious users or service providers may also try to intentionally bias the image generation. One goal might be to create racist stereotypes by replacing Latin characters with similar-looking characters from non-Latin scripts, so-called homoglyphs. To mitigate such unnoticed script attacks, we propose a novel homoglyph unlearning method to fine-tune a text encoder, making it robust against homoglyph manipulations.
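Because homoglyphs are hard to spot by eye, a prompt can be screened for them before it reaches a model. The following is a minimal sketch of such a check, not the homoglyph unlearning method from the paper: it flags every alphabetic character whose Unicode name does not begin with "LATIN". The function name `find_non_latin` and the example prompts are hypothetical, and the name-prefix test is a simple heuristic (it would also flag fullwidth Latin letters, for instance).

```python
import unicodedata


def find_non_latin(prompt: str):
    """Return (index, character, Unicode name) for each non-Latin letter.

    Heuristic assumption: a letter counts as Latin when its Unicode
    character name starts with "LATIN".
    """
    hits = []
    for index, char in enumerate(prompt):
        if not char.isalpha():
            continue  # skip spaces, digits, and punctuation
        name = unicodedata.name(char, "UNKNOWN")
        if not name.startswith("LATIN"):
            hits.append((index, char, name))
    return hits


# Hypothetical prompts: the second replaces one Latin "o" with Greek omicron.
clean = "A photo of an actress"
manipulated = "A phot\u03bf of an actress"

print(find_non_latin(clean))
print(find_non_latin(manipulated))
```

A check like this only reports suspicious characters; whether to reject, normalize, or accept the prompt is a separate policy decision, since the abstract notes that non-Latin characters can also be a legitimate way for users to reflect their own cultural background.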

encoder, homoglyph, non-latin character, (13 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.15388

AI Access Foundation

Country:

Europe > Greece (0.14)
North America > United States (0.14)
Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.05)
(37 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Law (0.66)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.91)

Exploiting Cultural Biases via Homoglyphs in Text-to-Image Synthesis

Struppek, Lukas, Hintersdorf, Dominik, Friedrich, Felix, Brack, Manuel, Schramowski, Patrick, Kersting, Kristian

arXiv.org Artificial Intelligence | Feb-13-2023

Models for text-to-image synthesis, such as DALL-E 2 and Stable Diffusion, have recently drawn a lot of interest from academia and the general public. These models are capable of producing high-quality images that depict a variety of concepts and styles when conditioned on textual descriptions. However, these models adopt cultural characteristics associated with specific Unicode scripts from their vast amount of training data, which may not be immediately apparent. We show that by simply inserting single non-Latin characters in a textual description, common models reflect cultural stereotypes and biases in their generated images. We analyze this behavior both qualitatively and quantitatively, and identify a model's text encoder as the root cause of the phenomenon. Additionally, malicious users or service providers may try to intentionally bias the image generation to create racist stereotypes by replacing Latin characters with similar-looking characters from non-Latin scripts, so-called homoglyphs. To mitigate such unnoticed script attacks, we propose a novel homoglyph unlearning method to fine-tune a text encoder, making it robust against homoglyph manipulations.
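To make the notion of a homoglyph concrete, the sketch below swaps a single Latin letter for a look-alike from another script and lists the resulting non-ASCII code points. The lookup table `HOMOGLYPHS`, the helper `swap_first`, and the prompt are all hypothetical and hand-picked for illustration; they are not taken from the paper.

```python
# Hypothetical, hand-picked table of look-alike characters (assumption:
# these pairs render nearly identically in many common fonts).
HOMOGLYPHS = {
    ("o", "greek"): "\u03bf",     # GREEK SMALL LETTER OMICRON
    ("o", "cyrillic"): "\u043e",  # CYRILLIC SMALL LETTER O
    ("a", "cyrillic"): "\u0430",  # CYRILLIC SMALL LETTER A
    ("e", "cyrillic"): "\u0435",  # CYRILLIC SMALL LETTER IE
}


def swap_first(prompt: str, latin_char: str, script: str) -> str:
    """Replace the first occurrence of latin_char with its look-alike."""
    return prompt.replace(latin_char, HOMOGLYPHS[(latin_char, script)], 1)


original = "A photo of a city"
swapped = swap_first(original, "o", "cyrillic")

# The two strings have the same length and look alike when rendered,
# yet they are different sequences of code points.
code_points = [f"U+{ord(c):04X}" for c in swapped if ord(c) > 127]
print(swapped == original, code_points)
```

The swapped prompt differs from the original in exactly one code point, which is the kind of single-character change the abstract describes as enough to shift what a model generates.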

artificial intelligence, homoglyph, machine learning, (14 more...)

arXiv.org Artificial Intelligence

arXiv: 2209.08891

Country:

Europe > Greece (0.14)
Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.05)
North America > United States > New York (0.04)
(35 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.46)

Industry:

Information Technology > Security & Privacy (1.00)
Law (0.87)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.74)
